ABSTRACT

For many users on social networks, one of the goals when
broadcasting content is to reach a large audience. The
probability of receiving reactions to a message differs for each
user and depends on various factors, such as location, daily
and weekly behavior patterns, and the visibility of the
message. While previous work has focused on overall network
dynamics and message flow cascades, the problem of
recommending personalized posting times has remained an
under-explored topic of research.

In this study, we formulate a when-to-post problem, where
the objective is to find the best times for a user to post on
social networks in order to maximize the probability of
audience responses. To understand the complexity of the
problem, we examine user behavior in terms of post-to-reaction
times, and compare cross-network and cross-city weekly
reaction behavior for users in different cities, on both Twitter
and Facebook. We perform this analysis on over a billion
posted messages and observed reactions, and propose
multiple approaches for generating personalized posting
schedules. We empirically assess these schedules on a sampled
user set of 0.5 million active users and more than 25 million
messages observed over a 56-day period. We show that
users see a reaction gain of up to 17% on Facebook and 4%
on Twitter when the recommended posting times are used.

We open the dataset used in this study, which includes 
timestamps for over 144 million posts and over 1.1 billion 
reactions. The personalized schedules derived here are used 
in a fully deployed production system to recommend posting 
times for millions of users every day. 
1. INTRODUCTION

Social networks have emerged as major platforms for
communication in recent years, with hundreds of millions of
interactions created by users every day. Though the underlying
mechanisms may vary, a large number of active interactions
may be classified under (a) users posting messages, or
(b) users reacting to messages. Posted messages may sometimes
be intended for a few friends and family members,
while other times they may be geared towards larger
audiences. The latter is especially true for users such as brands,
marketers and public figures, who leverage social media as
platforms for broadcasting messages.

One of the goals while broadcasting messages is to capture 
the attention of audience members so that they may react 
to the posted message. The probability that an audience 
member reacts to a message may depend on several factors, 
such as his daily and weekly behavior patterns, his location 
or timezone, and the volume of other messages competing 
for his attention. The problem of broadcasting messages at 
the right time in order to elicit responses from one’s audience 
is therefore a complex one with many dimensions. 

A large body of research in this area has focused on the
problem of influence maximization and related topics, where
the goal is to target a specific subset of users in order to
create information cascades in the network. However, the
dynamics of broadcasting to entire audiences, rather than
picking specific individuals to target, have been an under-explored
topic of study. Further, since each user has a unique
audience, any recommendations for posting times need to be
personalized to be effective, as we show in this study. We
hence formulate a when-to-post problem here, where the
objective is to find the best times for a user to post on social
networks in order to increase audience responses.

Apart from introducing the problem, our contributions 
in this work are three-fold. First, in order to understand 
the complexity of the when-to-post problem and the factors 
that affect it, we perform in-depth user reaction behavior 
analysis, which includes: 

1. Post-to-reaction behavior: We analyze the delays between
posting and reaction times across different social
networks and user in-degrees.

2. Cross-network analysis: We examine the similarities 
and differences of audience behavior on Twitter and 
Facebook. 

3. Cross-city analysis: We compare cycles of daily and 
weekly user activity in different cities, and present 
analysis on how location affects posting schedules. 

Second, we formally define the when-to-post problem in
a probabilistic setting, and propose multiple approaches for
recommending personalized posting schedules. Among
these are the First-Degree and the Second-Degree schedules,
and their corresponding weighted counterparts. We
empirically assess these schedules against two global baselines, on
a real-world set of 0.5 million active users observed over a
56-day period. We define a metric called Reaction Gain that
helps us evaluate the effectiveness of these approaches,
and show that users see an average reaction gain of up to
17% for Facebook and up to 4% for Twitter.

Third, we open a public dataset consisting of anonymized
user ids and timestamp data that could help future research
in this area. This dataset contains timestamps for 144 million
posts and 1.1 billion reactions from a 120-day period.

We performed our study and analysis on a full production
system deployed on klout.com. Klout¹ is a social media
platform that aggregates and analyzes data from social
networks [14] such as Twitter, Facebook, Google+ and others.
Our system recommends personalized posting schedules for
millions of users to share content on Twitter and Facebook.

2. RELATED WORK 

The subject of user behavior dynamics on social networks
has attracted significant research attention [10, 6, 2]. Wu
et al. [16] categorized Twitter users into elite and casual
users and analyzed the differences in how they generate and
consume information. In their study, they showed that
regardless of the type of content, all content had very short life
spans that usually dropped exponentially after a day.
Another study [1] also showed that only a few topics lasted
for a long time on social media platforms, while most topics
faded away quickly, on the order of 20-40 minutes.

Besides the life span of messages, researchers have also
analyzed the effects of timezone and location on user activity
patterns. Kwak et al. [9] analyzed the timezone characteristics
of user audiences on Twitter and reported that the
average timezone difference between a user and her friends
varied with the number of friends. In our study, we further
analyze the impact of audience location on the volume of
responses towards a message.

There have been several studies on modeling the dynamics 
of social network events [12, 15]. For example, the work in 
[15] used different convolution functions to analyze the flow 
of news events and sentiments through Twitter. While the 
approach of these studies has been to analyze the overall 
temporal characteristics on social media, here we take the 
further step of analyzing reaction behavior from the point of 
view of each individual user, thereby enabling personalized 
recommendations for posting messages. 

Another line of related research is in the area of information
flow and diffusion. Studies such as [11, 13, 5] have
analyzed how factors such as the topological structure of social
networks play a role in information cascades. Yang et
al. [17] presented results on analyzing message flow based
on Twitter mentions, and found that long-term historical
user properties such as the rate of previous mentions were
as important as the tweet content. The authors in [18] studied
the importance of hashtag adoption in determining the
popularity and spread of tweets. The study in [7] proposed
a predictive approach to model dynamics of diffusion in social
networks based on social, semantic and temporal dimensions.
However, the problem of examining the flow of
messages in the entire network differs significantly from the
one in our study. Here we are instead concerned with the
reactions received by a single user in a short time window.


¹ The Klout platform is a part of Lithium Technologies, Inc.


A large body of research has also focused on influence
maximization [8, 4, 3], which also differs from the when-to-post
problem. Influence maximization aims to find a subset
of users in a social network, such that targeting them with
a message maximizes the propagation or adoption of the
message throughout the network. However, the effects of
broadcasting messages to entire audiences, rather than
targeting specific individuals, have not been as well studied. It is
this problem that we propose and analyze here, by examining
the temporal aspects of broadcasting to one’s audience,
in order to get a large volume of responses.

3. PROBLEM SETTING 

In this section, we formulate the when-to-post problem 
and provide details about the system and dataset used. 

3.1 Problem Statement 

The actions taken on any social networking site may be
categorized as passive or active in nature. The passive
category may include actions such as views, while the active
category may broadly be classified into two groups: post
and reaction. Typical post behavior may include creating
and sending messages, sharing photos, or posting news
articles on a social network. Typical reaction behavior includes
resharing, liking, commenting, endorsing or replying to posts
created by other users. We restrict the scope of this study
to the post and reaction behavior of users.

Sometimes the post behavior is used in the context of
one-on-one or personal communication, while other times it may
be geared towards a larger audience. Here we focus on the
latter case, where one of the motivations behind posting is
to reach a large audience and to capture their attention.
In particular, we examine the time-related aspects of this
behavior and frame a when-to-post problem as follows:

Problem Statement: For a user on a social network, 
find the best time to post a message within a specified 
time period in order to maximize the probability of receiving 
audience reactions. 

Note that we only consider first-degree reactions such as
replies and retweets on Twitter and comments on Facebook,
and not those caused by an audience member resharing the
original post. In other words, we focus mainly on the
reactions a post receives from the user’s immediate audience, and
not on how the post propagates through the network.

3.2 System Overview 

We collect user posts from Facebook through the oauth-token
provided by registered users on Klout. We also use
the oauth-token-based approach to collect the friend graph
of users on Facebook and the follower graph for users on
Twitter. Klout partners with GNIP to collect public data
generated in the Twitter Mention Stream². For location
analysis, we use the city, state and country information
provided by registered users on the Klout application.

The collected data is written out to a Hadoop cluster³ that
uses HDFS as the file system, HBase as the serving datastore,
and Hive⁴ to process, query and manage the large
datasets. We implement independent Java utilities with
Hive UDF (User Defined Function) wrappers, with functions
to process user locations and timezones, and operators
such as discrete convolution to process time-series vectors.
The combination of Hive Query Language and UDFs allows
us to build map-reduce jobs that can scale up to analyze
billions of messages posted to social platforms every day.
A pipeline run on a 150-node cluster has a cumulative I/O
footprint of 224GB of reads, 78GB of writes, and 9.62 days
of CPU usage. Fig. 1 shows an overview of the system.

² https://gnip.com/sources/twitter
³ http://wiki.apache.org/hadoop/
⁴ http://hive.apache.org/



3.3 Dataset 

The dataset used to run experiments and build models has
been opened at https://github.com/klout/opendata. The
corpus has event timestamps for posts that were created
between October 15, 2014 and February 11, 2015 and received
at least one reaction. The dataset was generated from more
than 1 million users apiece from Facebook and Twitter,
with accounts registered on Klout.com. For Facebook the
dataset includes more than 25 million post timestamps and
104 million reaction timestamps, while for Twitter these
numbers are 119 million and 1 billion respectively. In order
to preserve privacy, timestamps were slightly perturbed
and user and post ids were anonymized using custom
fingerprint functions.

4. BEHAVIOR ANALYSIS 

In this section we perform in-depth user behavior analysis
across temporal and local dimensions, such as
post-to-reaction delay, user location and the network of activity.
This analysis provides some interesting observations and
valuable insights into the when-to-post problem.

4.1 Post to Reaction Time Analysis 

To start with, we note that there is always an inherent 
delay between when a post was created and when a user 
reacts to it. This delay is crucial to consider when we study 
the when-to-post problem. 

Specifically, we are concerned with the post-to-reaction
delay within a short time window, and we choose this window
to be 24 hours. This is also in accordance with previous
studies such as [16] that have shown that messages on social
media are short-lived, with exponential dropoff after a day.
In the limiting case when there is no dropoff and the delay
is infinite, all posts have the same probability of getting
responses. Thus it is because of this dropoff within a finite
duration that the when-to-post problem becomes important.




Networks

$T_{24h}(p)$    TW       FB       FP       GP
0.25            00:03    00:25    00:31    00:35
0.50            00:24    01:42    02:12    02:19
0.75            02:24    05:65    07:26    07:36
0.90            08:53    13:14    14:57    15:16

Audience In-Degree (Twitter)

$T_{24h}(p)$    10-100   100-1K   10K-100K   1M-10M
0.25            00:08    00:03    00:03      00:06
0.50            00:41    00:20    00:20      01:48
0.75            02:53    01:58    03:11      07:52
0.90            08:49    07:50    11:22      16:26

Table 1: $T_{24h}(p)$, Post-to-Reaction Times [hh:mm]
Further, since most reactions occur in narrow time windows
for both networks, the goal should be to recommend posting
times in narrow time buckets. To examine the speed of
reactions, we define a metric $T_d(p)$ as follows:

Definition 1. Let $R$ be the total number of reactions received
by all posts within a time period $d$ since posting time.
Then $T_d(p)$ is defined as the amount of time that passes
between posting time and the time when the cumulative
reaction count is equal to a fraction $p$ of $R$. □
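
As an illustration of Definition 1, the following minimal sketch computes $T_d(p)$ from a list of observed post-to-reaction delays. The function name `t_d`, the use of minutes as the unit, and the choice to return the delay of the first reaction at which the cumulative count reaches $p \cdot R$ are assumptions made for this example, not details taken from the production system.

```python
import math


def t_d(delays_minutes, p, d_minutes=24 * 60):
    """Return the delay by which a fraction p of the reactions that arrive
    within d_minutes of posting have been observed (hypothetical helper)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    # R counts only reactions that fall inside the window d.
    window = sorted(x for x in delays_minutes if 0 <= x <= d_minutes)
    if not window:
        raise ValueError("no reactions inside the window")
    # Index of the first reaction at which the cumulative count reaches p * R.
    target = math.ceil(p * len(window))
    return window[target - 1]


if __name__ == "__main__":
    delays = [1, 3, 10, 40, 2000]  # minutes; 2000 lies outside the 24h window
    print(t_d(delays, 0.5))  # delay of the 2nd of the 4 in-window reactions
```

Reactions that arrive after the window $d$ are excluded from $R$, which matches the definition's restriction to reactions received within $d$ of posting time.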

Along with the reaction counts, we use this metric $T_d(p)$
to further analyze post-to-reaction behavior across different
dimensions of the problem. Fig. 2 plots the fraction
of cumulative reaction counts occurring within 24 hours of
posting, and Table 1 shows the corresponding $T_{24h}(p)$ values.

Further, we would also like to understand the probability
distribution of a reaction occurring within a given time window
since the time of post creation. In order to do this, we
define a Post-to-Reaction Filter function as follows:

Definition 2. Post To Reaction Filter. For a time interval
$d$, the post-to-reaction filter function $PTR(d)$ is defined
as a discrete probability distribution over the event that
a reaction occurs within time $d$ of creating a post. □

We estimate the post-to-reaction filter function PTR(d) 
by aggregating reaction times across all observed messages 
and reactions in a network. This filter function will be used 
in Sec. 5 when we derive personalized user schedules. 
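
One way to picture this estimate is as a normalized histogram of observed delays. The sketch below assumes that PTR(d) is read as a probability mass function over delay buckets, and it assumes 15-minute buckets over a 24-hour window; the name `estimate_ptr` and its parameters are hypothetical and chosen only for illustration.

```python
def estimate_ptr(delays_minutes, bucket_minutes=15, window_minutes=24 * 60):
    """Estimate a discrete post-to-reaction filter as a normalized histogram
    of delays (assumed interpretation of PTR(d), not the authors' code)."""
    n_buckets = window_minutes // bucket_minutes
    counts = [0] * n_buckets
    for delay in delays_minutes:
        # Ignore reactions that fall outside the observation window.
        if 0 <= delay < window_minutes:
            counts[int(delay // bucket_minutes)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no reactions inside the window")
    # Normalize so that the entries sum to 1.
    return [c / total for c in counts]


if __name__ == "__main__":
    ptr = estimate_ptr([1, 5, 20, 50, 3000])
    print(len(ptr), ptr[:4])
```

Because the filter is aggregated over all messages in a network, a separate call per network would be needed to reflect network-specific timescales.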

4.1.1 Reaction Times By Network 

Posting and reaction behavior varies on social networks
because of many factors, such as the manner of posting, the
presentation of posts to users and the set of possible reactions
that a user can perform. We compare post-to-reaction times
across three major social networks: Twitter (TW), Facebook
(FB) and Google+ (GP). We also treat Facebook Fan
Pages (FP) as a separate network, since the dynamics of
posting and reacting on these pages diverge significantly
from personal Facebook pages. The top halves of Fig. 2
and Table 1 show the reaction times for different networks.

We observe that Twitter exhibits a much higher speed of
reactions than Facebook. On Twitter, 25% of
the reactions take place in the first 3 minutes, 50% within
the first half hour, and 90% within the first 9 hours. Other
networks are slower than Twitter, with 50% of reactions on
Facebook, Facebook Pages and Google+ arriving within
roughly the first 2 hours of posting (01:42 to 02:19 in
Table 1). Interestingly, we see that the Facebook Pages network
shows reaction times more similar to Google+ than to
Facebook, indicating that similar responses can be elicited
from users belonging to completely disjoint user sets, if the
underlying dynamics of interactions are similar.

In the rest of this paper, we mainly focus on Twitter
and Facebook, which show significant variations in
post-to-reaction delays. The distribution of post-to-reaction delay
for Twitter is narrower and falls off more quickly compared
to Facebook. The $T_{24h}(p)$ values in Table 1 suggest that a
15-minute bucket can capture the necessary granularity of
reactions, which we choose as the length of our time buckets.

These variations also highlight that social networks operate
on different timescales, and the post-to-reaction filter
function needs to be computed separately for each network
during comparison. Next, we consider the dependence of
reaction behavior on the in-degree of users posting messages.

4.1.2 Reaction Times By User In-Degrees 

Next, we explore the hypothesis that network sizes of users
may be a factor that affects reaction times. To do so, we
analyze how an audience member’s in-degree affects his
reaction behavior. Fig. 2 (bottom) plots the fractions of
24-hour reaction counts against the time elapsed, for different
sets of in-degrees of audience members on Twitter. Table 1
(bottom) shows the corresponding $T_{24h}(p)$ values.

We find that a large section of audience members, with
in-degrees between 100 and 100K, exhibit similar behavior. More
than 60% of the reactions from such users are created in the
first hour. Users with low in-degrees between 10 and 100 have
slower response times, perhaps because they are not very active
users. The users with in-degrees greater than 1M have
the slowest reaction times among all users. This may be
attributed to such users being celebrities and brands who may
not react to messages as quickly as other users do, because
of the large volume of messages they see.

Thus, a large portion of audience members show similar
reaction behavior, though they may have differing
in-degrees. We can therefore infer that the when-to-post problem
does not have a large dependency on the network sizes of
audience members, unless these sizes are very small or very
large. This permits us to use a common post-to-reaction
filter function for all users in a given network.

4.2 Network and Location Analysis 

User post and reaction behaviors are multi-dimensional
and are highly dependent on the location, network and
timezone of the user. In this section, we analyze normalized
aggregated user audience reaction behaviors $S(u)$, for user
cohorts within and across various cities as well as across
Facebook and Twitter within a given city. For behavior analysis
we use correlation and cosine similarity metrics. Correlation
and cosine similarity between finite time series $S(u_1)$
and $S(u_2)$ are defined in Equations 1 and 2 respectively.

Cosine similarity reveals the overlap between time series,
while correlation reveals closeness in time-dependent
patterns between them. We observe metric distributions for 10
to 50 million user pairs, depending on the cohorts compared,
where $u_1$ is selected from the first cohort and $u_2$ from the
second. In addition to the metrics above, we compare the
raw time series to gain further insights into reaction
behaviors in Figs. 4 and 5.
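
To make the two metrics concrete, the sketch below assumes that Equations 1 and 2 are the standard Pearson correlation and cosine similarity between two equal-length time series; the paper's exact normalization may differ. The function names are illustrative only.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def correlation(x, y):
    """Pearson correlation, computed as the cosine similarity of the
    mean-centered series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


if __name__ == "__main__":
    s1 = [1, 2, 3]
    s2 = [3, 2, 1]
    # High overlap (both series are positive) but opposite trends.
    print(cosine_similarity(s1, s2), correlation(s1, s2))
```

The example pair illustrates the distinction drawn in the text: the two series overlap substantially, so their cosine similarity is positive, while their time-dependent patterns move in opposite directions, so their correlation is negative.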
4.2.1 Network Level Analysis

In this section, we analyze the user reaction profiles across 
Twitter and Facebook for users in New York City (NYC). 
Fig. 3 top shows expected audience reactions, aggregated 
across all users in NYC. 

We observe that the daily seasonality is more pronounced
for Twitter than Facebook, with taller peaks and deeper
troughs. Twitter usage seems to peak during working hours
and drops quickly thereafter. Both networks also exhibit
secondary peaks at around 7-8pm daily. The amplitude of
expected reactions on Twitter is around twice that of
Facebook, meaning posting on Twitter at the right times can
lead to comparatively larger gains. Also, compared to
Twitter, Facebook usage is more consistent throughout the day.

With respect to weekly trends, we find that Twitter activity
falls to almost half of its weekday amplitude on Saturday
and Sunday, whereas Facebook activity seems to be less
affected by weekends. It is interesting to note that Facebook
is most consistently used throughout the day on Sundays.
Figure 3: Top: Per-Network Globally Aggregated User Audience
Reaction Behaviors. Bottom: Distribution of Cross-Network Cosine
Similarity and Correlation Calculated Per-User. Both: All
data plotted for users in New York City.























We compare aggregated user audience reaction behaviors
$S_{FB}(u_i)$ and $S_{TW}(u_i)$ for Facebook and Twitter
respectively using Eq. 1 and 2 in Fig. 3 bottom. We observe
that correlation is positive, and relatively uniform in the
0.3-0.8 range, which means that daily audience patterns
across Twitter and Facebook are only moderately correlated.
Both the similarity and correlation curves suggest that
although audience reactions exhibit some similarity and
correlation across networks for a given user, there are still
significant differences. This again reinforces the need for any
recommended schedules to be personalized per network.

4.2.2 Cross-City Analysis 

In this section we analyze differences in behavior for
multiple cities across Facebook and Twitter. Figs. 4a and 5a
show reaction behaviors, shifted to the local timezone of the
city, for Facebook and Twitter respectively.

Observing the Facebook reactions in Fig. 4a, we notice 
that the US cities of San Francisco and New York exhibit 
similar shapes, where reactions peak at the beginning of 
work hours. For Paris, the reactions peak in the second 
half of working hours, while for London most reactions are 
expected towards the end of working hours. Finally, the 
pattern for Tokyo is quite different from the rest with two 
peaks, both occurring off working hours. 

The Twitter reactions in Fig. 5a have similar patterns
to Facebook. The notable difference is that Twitter
reactions for US cities have more pronounced daily peaks, while
for London, Paris and Tokyo the behavior seems more
consistent throughout the day. All the curves show significant
drops on weekends, and Saturday has noticeably lower
activity than Sunday. We also observe that New York schedules
lag slightly compared to San Francisco, which may be
explained by lifestyle differences between the two cities.

In addition to the visual analysis, we also analyze
similarity and correlations for reaction behaviors between cities,
calculated according to Eq. 1 and 2. The time series
compared in this case are the reactions aggregated across users
in two cities, denoted by $S(C_1)$ and $S(C_2)$. Figures 4b-4e and
5b-5e show these distributions for Facebook and Twitter within
the same city and across different cities.

Interestingly, in US cities (New York and San Francisco),
cross-city correlation and similarity for both Facebook and
Twitter are not very different from their within-city metrics.
Globally, Twitter reaction behavior seems to be more
correlated and similar than Facebook reaction behavior. On
Facebook, within-city behavior correlation and similarity are
lowest for London and Tokyo, and have high deviation. This indicates
that users within these cities exhibit more diverse behavior
patterns compared to US cities. Therefore a city-level model
built for London may not apply to all users within the city.

5. PERSONALIZED SCHEDULES 

The analysis in the previous section highlights the
importance of having personalized posting schedules. Here we
present multiple approaches to derive such schedules.

5.1 Notation and Definitions 

To start with, we simplify the computation by bucketizing
time within a period $P$ into discrete time intervals $t_k$. Based
on the analysis in Sec. 4, we use 15-minute time intervals
within a period of one week for a total of 4 x 24 x 7 = 672
buckets, though the methods described here are applicable
to any time interval and period. Because the number of
reactions in one bucket in each period is usually small for
most users, we aggregate the actions from multiple periods
into the same bucket. For example, all the actions taken by
a user between 00:00 and 00:15 on Mondays, in a 90-day time
window, will be grouped into the first bucket $t_1$.
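
The bucketization can be sketched as follows. This is a minimal illustration that assumes timestamps are already expressed in the user's local time and that the week starts on Monday at 00:00; the helper names `bucket_index` and `action_profile` are hypothetical. Indices are 0-based here, so the first bucket $t_1$ of the text corresponds to index 0.

```python
from datetime import datetime

BUCKET_MINUTES = 15
BUCKETS_PER_WEEK = 7 * 24 * 60 // BUCKET_MINUTES  # 4 x 24 x 7 = 672


def bucket_index(ts):
    """Map a local-time datetime to its weekly 15-minute bucket (0-based)."""
    minutes_into_week = ts.weekday() * 24 * 60 + ts.hour * 60 + ts.minute
    return minutes_into_week // BUCKET_MINUTES


def action_profile(timestamps):
    """Aggregate actions from any number of weeks into one weekly profile."""
    profile = [0] * BUCKETS_PER_WEEK
    for ts in timestamps:
        profile[bucket_index(ts)] += 1
    return profile


if __name__ == "__main__":
    actions = [
        datetime(2014, 10, 20, 0, 5),    # a Monday, 00:05
        datetime(2014, 10, 27, 0, 14),   # the following Monday, 00:14
        datetime(2014, 10, 26, 23, 59),  # a Sunday, 23:59
    ]
    profile = action_profile(actions)
    print(profile[0], profile[671])
```

Actions from different weeks that fall in the same weekday and time slot land in the same bucket, which is the aggregation across periods described above.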

We also define the following sets associated with a user: 

Definition 3. For a user $u$, the set $U_{out}(u)$ is defined as
the set of all users who are connected to $u$, and can
potentially react to the posts created by $u$.

Definition 4. For a user $u$, the set $U_{in}(u)$ is defined as
the set of all users to whom $u$ is connected, and whose posts
can potentially be reacted upon by $u$.

Note that though we treat the above sets as separate entities
in order to differentiate between the post and reaction
behavior, we do not assume that they are disjoint sets.⁵

Let $N$ be the number of time buckets within the time period
$P$ under consideration. To represent the actions associated
with a user with respect to time, we create time-based
action profiles for each user, computed from a user’s actions
in the period $P$, and aggregated into the buckets $t_k$. These
profiles can thus be represented as vectors of length $N$.

We define four primary action profiles for each user: 

• First, for each user $u$, we define a Created Posts
profile $C(u)$ that represents the posts created by the
user in each time bucket.

• Inversely, we can also define a Visible Posts profile
$V(u)$, which represents the posts from $U_{in}(u)$ that
are visible to the user and that he could potentially
react to.

• Based on the posts that a user sees, he may respond
to them in some manner. We can represent these
responses as a Self Reaction profile $R(u)$ for the user.

• Finally, we define an Estimated Audience Reaction
profile $Q(u)$ that estimates the number of reactions
received by the user from his audience $U_{out}(u)$ in
each time bucket.

As noted in previous works such as [1] and [15], and as
analyzed in Sec. 4, there is usually a time difference between
when a post is created by a user in $U_{in}(u)$, and when the
user $u$ may react to it. Thus a specific post may be visible in
the time bucket $t_k$ in $V(u)$, but may only be reacted upon in
a later time bucket $t_{k'}$ in $R(u)$. The post-to-reaction filter
function defined in the previous section represents this lag
in terms of a time interval $d$, discretized into time buckets
of size $t_k$. We can therefore compute a Delayed Reaction
Profile for a user by performing a discrete convolution
operation of the original reaction profile with the
post-to-reaction filter function.

$$ R_d(u) = R(u) * PTR(d) \tag{3} $$

where $*$ is the discrete convolution operator.⁶
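
The convolution in Eq. 3 can be sketched in a few lines. The version below assumes that the weekly profile is periodic, so that reactions spilling past the last bucket wrap around to the start of the week; this circular treatment and the function name `delayed_reaction_profile` are assumptions for illustration, and the production Hive UDF may handle the boundary differently.

```python
def delayed_reaction_profile(reaction_profile, ptr):
    """Circular discrete convolution of a weekly reaction profile with a
    post-to-reaction filter: out[k] = sum_m R[m] * PTR[(k - m) mod N]."""
    n = len(reaction_profile)
    delayed = [0.0] * n
    for m, reactions in enumerate(reaction_profile):
        if reactions == 0:
            continue
        for lag, weight in enumerate(ptr):
            # Spread the reactions of bucket m over the following buckets,
            # wrapping around the end of the period (assumed periodicity).
            delayed[(m + lag) % n] += reactions * weight
    return delayed


if __name__ == "__main__":
    profile = [0, 0, 0, 4]     # toy period with N = 4 buckets
    ptr = [0.5, 0.25, 0.25]    # toy filter over three lag buckets
    print(delayed_reaction_profile(profile, ptr))
```

When the filter sums to 1, the total number of reactions in the profile is preserved and only their distribution over buckets changes.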

Each element $r_{d,k}(u)$ in the delayed reaction profile
represents the number of reactions that the user $u$ would generate
in the time interval $d$ following the bucket $t_k$. Thus for a
post created by a user in the current time bucket, using
$R_d(u)$ for his audience members provides a better estimate
of anticipated future reactions.

⁵ For some bi-directional relationships such as Facebook
Friends, $U_{out}(u)$ and $U_{in}(u)$ are equivalent.

⁶ For two functions $f$, $g$ defined on the set of integers $\mathbb{Z}$, the
discrete convolution of $f$, $g$ is given by:
$(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n-m]$.

Figure 4: Facebook - City-Level Reaction Behavior. Panels (b)-(e):
same-city and cross-city correlation and similarity distributions.

Figure 5: Twitter - City-Level Reaction Behavior. Panels (b)-(e):
same-city and cross-city correlation and similarity distributions.

These estimates for $q_i(u)$ could be computed in multiple
ways, as described in the following section. Once $Q(u)$ is
known, we can determine a probability mass function which
represents a post schedule for the user. These probabilities
$s_i(u)$ can be computed as:

$$ s_i(u) = \frac{q_i(u)}{\sum_{j=1}^{N} q_j(u)} \tag{4} $$

Finally, the vector consisting of these probabilities
determines the Post Schedule for the user. Once we have $S(u)$,
we simply pick the buckets with the highest values of $s_i(u)$,
which are the desired best times to post. Next, we describe
multiple approaches to compute $S(u)$ using the above
notation and definitions, which are summarized in Table 2.
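
The normalization in Eq. 4 and the selection of the best buckets can be sketched as follows. The function names and the parameter `k` (how many buckets to recommend) are hypothetical; the text does not state how many buckets are picked.

```python
def post_schedule(q):
    """Normalize an estimated audience reaction profile Q(u) into a
    probability mass function S(u), as in Eq. 4."""
    total = sum(q)
    if total <= 0:
        raise ValueError("profile contains no reactions")
    return [value / total for value in q]


def best_buckets(schedule, k=3):
    """Return the indices of the k buckets with the highest probability.
    Ties keep their original bucket order because sorted() is stable."""
    order = sorted(range(len(schedule)), key=lambda i: schedule[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    q = [2, 6, 0, 2]  # toy profile with N = 4 buckets
    s = post_schedule(q)
    print(s, best_buckets(s, k=2))
```

Since normalization does not change the ranking of buckets, the same indices would be selected from $Q(u)$ directly; the probabilities are useful when schedules from different users need to be compared on a common scale.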

5.2 Recommended Schedule Derivation 

To illustrate the when-to-post problem with a concrete
example, consider a simplified social network graph, as
represented in Fig. 6. For the user $a_0$, her audience is made up
of other users $b_j$, so we have: $U_{out}(a_0) = \{b_0, b_1, b_2, \ldots, b_m\}$.

When $a_0$ creates a post, it may be potentially seen by all
the members $b_j$ of her audience. Let us focus on a particular
audience member $b_0$. This audience member $b_0$ also belongs
to the audience sets of other users $a_i$, and may see posts that
are created by each of them. We can represent this
relationship between the users as: $U_{in}(b_0) = \{a_0, a_1, a_2, \ldots, a_n\}$.



Figure 6: Simplified representation of a user’s social graph

We would like to derive the post schedule $S(a_0)$ for the
user $a_0$. In order to do so, we want to answer the following
question: For the user $a_0$, what is the expected number of
reactions received from $U_{out}(a_0)$ for a post created in the time
bucket $t_k$? We describe two approaches below to answer this
question and compute the recommended schedule.

5.2.1 First-Degree Reaction Schedule 

In this approach, we consider the reactions of $a_0$’s audience
$U_{out}(a_0)$, ignoring the second-degree effects of the other
posting users $a_i$. With respect to Fig. 6, we consider only
the left part of the diagram that represents $a_0$ and $U_{out}(a_0)$
(including $b_0$), and ignore all other $a_i$.

Since we know the reaction profiles $R(b_j)$ for the members
of $a_0$’s first-degree graph, we can accumulate these reaction
counts per time bucket to get the combined audience reaction
profile. However, since this does not take into account
the post-to-reaction delay, a better approach is to aggregate
the delayed reaction profiles $R_d(b_j)$ for all $b_j$ in $U_{out}(a_0)$.

This sum of delayed reactions per bucket gives us the
estimated audience reaction profile $Q(a_0)$ for the user, where
the elements of the vector are given by:

$$ q_k(a_0) = \sum_{j=0}^{m} r_{d,k}(b_j) \tag{5} $$

Thus in this case, the probability $s_k(a_0)$ of receiving a
reaction in any given time bucket can be computed from
$Q(a_0)$ as per Eq. 4. These probabilities determine the
First-Degree Reaction posting schedule $S_1(a_0)$.
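The first-degree computation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy counts are hypothetical, and `to_schedule` assumes that Eq. 4 normalises the profile so that its buckets sum to one, which is consistent with the schedule elements being probabilities.

```python
from typing import Dict, List


def first_degree_profile(delayed_profiles: Dict[str, List[float]]) -> List[float]:
    """Eq. 5: sum the delayed reaction profiles of all audience members."""
    profiles = list(delayed_profiles.values())
    if not profiles:
        return []
    n_buckets = len(profiles[0])
    return [sum(p[k] for p in profiles) for k in range(n_buckets)]


def to_schedule(q: List[float]) -> List[float]:
    """Normalise a profile so its buckets sum to 1 (assumed form of Eq. 4)."""
    total = sum(q)
    if total == 0:
        return [0.0] * len(q)
    return [x / total for x in q]


# Hypothetical delayed reaction counts: three audience members, four buckets.
audience = {
    "b0": [1, 0, 3, 0],
    "b1": [0, 2, 1, 1],
    "b2": [1, 0, 0, 0],
}
q = first_degree_profile(audience)
s1 = to_schedule(q)
# Index of the bucket with the highest reaction probability.
best_bucket = max(range(len(s1)), key=lambda k: s1[k])
```

In this toy example the third bucket (index 2) collects the most delayed reactions and is therefore the recommended posting time.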

Note that $S_1(a_0)$ does not take into account the behavior
of an audience member $b_j$ with respect to posts from
other users $a_i$. In other words, this approach only takes into
account the first-degree dependency for the user $a_0$. We
therefore describe another approach that takes into account
the second-degree dependency as well.

5.2.2 Second-Degree Reaction Schedule

In Fig. 6, the actions of the users $a_i$ represent the
second-degree effects for user $a_0$, since they affect how $a_0$’s
first-degree connection $b_0$ reacts to messages. To consider these
second-degree effects, we define a Second-Degree Reaction
schedule $S_2(a_0)$, which can be derived by first answering the
following questions, before returning to the original one above.

• When do the users $a_i$ create posts?

• When does a specific audience member $b_0$ react to the
posts created by $a_i$?

• What is the probability that $b_0$ reacts to a post in a
certain time bucket $t_k$?

The answer to the first question is given by the post creation
profiles $C(a_i)$ for each user $a_i$, computed by aggregating
the past history of post creation events for the user
into time buckets. To answer the second question, we first
compute the reaction profile $R(b_0)$. Again, this profile is
computed by aggregating the past history of reaction events
for $b_0$, which tells us how often he reacts in any given time
bucket. The answer to when $b_0$ reacts with respect to posting
times is then given by the delayed reaction profile $R_d(b_0)$,
which takes into account the post-to-reaction delay.

For the third question, let $p(b_0, t_k)$ be the probability that
user $b_0$ reacts to a post in time bucket $t_k$. This event can
be modeled as a Bernoulli random variable $X_{b_0,k}$, with the
probability of the reaction given by $p(b_0, t_k)$, thus:

$$E(X_{b_0,k}) = p(b_0, t_k) \tag{6}$$

From the point of view of $b_0$, the probability that he reacts
to some post in the time bucket $t_k$ depends on the number
of posts that he sees, and his usual reaction behavior in $t_k$.^7

To estimate the number of posts that are potentially visible
to the user $b_0$ in each time bucket, we aggregate the
post creation profiles for all $a_i$. The number of posts that
are actually visible to the user may be modeled as a linear
function of the total created posts. Thus for a given time
bucket $t_k$, the number of posts visible to $b_0$ is given by:

$$v_k(b_0) = \alpha \cdot \sum_{i=0}^{n} c'_k(a_i) + \beta \tag{7}$$

where $\alpha$ and $\beta$ are constants and $c'_k(a_i)$ is a rescaled version
of $c_k(a_i)$. These constants may depend on network-specific
factors, and we assume that they are globally applicable
to all users in a given network.

7 Since we are concerned only with the time aspects here, we
assume that the posts seen by the user are equally likely to
be reacted upon in all other aspects.

Table 2: Notation for Action Profiles

| User Action Profile | Vector Notation | Element Notation | Element Description for user $u$ in time bucket $t_k$ |
|---|---|---|---|
| Created Posts | $C(u)$ | $c_k(u)$ | aggregated number of posts created by user |
| Visible Posts | $V(u)$ | $v_k(u)$ | aggregated number of posts visible to user |
| Self Reactions | $R(u)$ | $r_k(u)$ | aggregated number of reactions generated by user |
| Delayed Self Reactions | $R_d(u)$ | $r_{d,k}(u)$ | aggregated number of reactions generated by user in the time interval $d$ following $t_k$ |
| Estimated Audience Reactions | $Q(u)$ | $q_k(u)$ | estimated number of reactions received by user |
| Post Schedule | $S(u)$ | $s_k(u)$ | probability of receiving a reaction on a post created by user |

With this information, the a priori probability in Eq. 6
can now be computed as:

$$p(b_0, t_k) = \frac{\text{number of delayed reactions by } b_0 \text{ in } t_k}{\text{number of posts visible to } b_0 \text{ in } t_k} = \frac{r_{d,k}(b_0)}{v_k(b_0)} \tag{8}$$
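As a rough illustration of Eqs. 7 and 8, the following Python sketch estimates the visible posts for one audience member and the resulting per-bucket reaction probability. The helper names and the sample numbers are hypothetical. Dividing each creation profile by its mean is only one possible reading of the rescaling step, and the defaults for `alpha` and `beta` follow the value of 1.0 reported in Sec. 6.3.

```python
from typing import List


def rescale_by_mean(c: List[float]) -> List[float]:
    """Divide a creation profile by its mean (assumed form of the rescaling)."""
    mean = sum(c) / len(c)
    if mean == 0:
        return [0.0] * len(c)
    return [x / mean for x in c]


def visible_posts(creation_profiles: List[List[float]],
                  alpha: float = 1.0, beta: float = 1.0) -> List[float]:
    """Eq. 7: v_k = alpha * sum_i c'_k(a_i) + beta, for every bucket k."""
    rescaled = [rescale_by_mean(c) for c in creation_profiles]
    n_buckets = len(rescaled[0])
    return [alpha * sum(c[k] for c in rescaled) + beta for k in range(n_buckets)]


def reaction_probability(r_d: List[float], v: List[float]) -> List[float]:
    """Eq. 8: p(b_0, t_k) = r_{d,k}(b_0) / v_k(b_0).

    With rescaled inputs the ratio is not guaranteed to stay below 1,
    so it is best read as a relative score rather than a strict probability.
    """
    return [r / vis for r, vis in zip(r_d, v)]


# Hypothetical creation profiles of two posting users a_0 and a_1 (four buckets).
creation = [[2, 2, 0, 4], [1, 3, 0, 0]]
v = visible_posts(creation)
# Hypothetical delayed reaction profile of the audience member b_0.
p = reaction_probability([1.5, 1.0, 0.0, 0.75], v)
```

With a positive `beta` the denominator never reaches zero, so buckets in which nobody posts do not need special handling in this sketch.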

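Now we turn our attention back to the original user $a_0$.
Let $Y_{a_0,k}$ be the random variable representing the number
of reactions that $a_0$ receives for a post created in a specific
time bucket $t_k$. We would like to find the expected number
of reactions $E(Y_{a_0,k})$, which can be computed as:

$$E(Y_{a_0,k}) = E\left(\sum_{j=0}^{m} X_{b_j,k}\right) = \sum_{j=0}^{m} E(X_{b_j,k}) = \sum_{j=0}^{m} p(b_j, t_k) = \sum_{j=0}^{m} \frac{r_{d,k}(b_j)}{\alpha \cdot \sum_{i=0}^{n} c'_k(a_i) + \beta} \tag{9}$$

Here the inner sum runs over the users $a_i$ in $U_{in}(b_j)$, as in Eq. 7.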

Thus, these expected values computed from the observed
$R_d(u)$ and $C(u)$ give us the estimates for the number of
reactions received by $a_0$. The elements of the audience reaction
profile $Q(a_0)$ are hence given by:

$$q_k(a_0) = E(Y_{a_0,k}) \tag{10}$$

Finally, we can infer the desired posting schedule $S_2(a_0)$
for the user $a_0$ as the probability mass function for the
discrete random variable $Y_{a_0,k}$. Again, the elements of $S_2(a_0)$
are computed from $Q(a_0)$ as per Eq. 4.
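The second-degree profile of Eqs. 9 and 10 can be sketched as follows. The function names and numbers are hypothetical, each audience member is represented by a pair of a delayed reaction profile and a visible-post profile, and the final normalisation again assumes that Eq. 4 divides each bucket by the sum over all buckets.

```python
from typing import List, Tuple

# One audience member: (delayed reaction profile r_d, visible post profile v).
Member = Tuple[List[float], List[float]]


def second_degree_profile(audience: List[Member]) -> List[float]:
    """Eqs. 9-10: q_k(a_0) = sum_j r_{d,k}(b_j) / v_k(b_j)."""
    n_buckets = len(audience[0][0])
    q = [0.0] * n_buckets
    for r_d, v in audience:
        for k in range(n_buckets):
            q[k] += r_d[k] / v[k]
    return q


def to_schedule(q: List[float]) -> List[float]:
    """Normalise a profile so its buckets sum to 1 (assumed form of Eq. 4)."""
    total = sum(q)
    if total == 0:
        return [0.0] * len(q)
    return [x / total for x in q]


# Hypothetical audience of two members observed over three buckets.
audience = [
    ([2, 3, 0], [2, 6, 1]),  # b_0
    ([1, 2, 1], [2, 8, 4]),  # b_1
]
q2 = second_degree_profile(audience)
s2 = to_schedule(q2)

# Raw sum of delayed reactions, as a first-degree profile would use.
raw = [sum(r_d[k] for r_d, _ in audience) for k in range(3)]
```

In this toy example the raw reaction counts peak in the second bucket, but that bucket is also crowded with visible posts, so the second-degree profile favours the first bucket instead.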


5.2.3 User Weighted Schedules 

In the sums computed above for the first- and second- 
degree schedules, all audience members are treated equally. 
However, audience members may have differing tendencies 
to react to the user’s posts depending on their affinity to the 
user. These differences can be accounted for by associating 
a weight with each audience member who may react to the 
user, computed based on previous actions as follows: 


$$w(b_j, a_0) = \frac{\text{total reactions received by } a_0 \text{ from } b_j}{\text{total overall reactions received by } a_0} \tag{11}$$


Eq. 5 can now be modified with this weight as:

$$q_k(a_0) = \sum_{j=0}^{m} w(b_j, a_0) \cdot r_{d,k}(b_j) \tag{12}$$

Similarly, the expected number of reactions for the second-degree
schedule in Eq. 9 can also be modified as:

$$E(Y_{a_0,k}) = \sum_{j=0}^{m} w(b_j, a_0) \cdot E(X_{b_j,k}) \tag{13}$$


We denote these weighted schedules as $S_{1,w}(u)$ and $S_{2,w}(u)$
respectively. In Sec. 6 we evaluate the performance of all
four schedules described above.
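The weighting step can be illustrated with a small Python sketch of Eqs. 11 and 12. The function names, user labels and counts are hypothetical, and the sketch only covers the weighted first-degree profile.

```python
from typing import Dict, List


def audience_weights(reactions_from: Dict[str, int]) -> Dict[str, float]:
    """Eq. 11: share of a_0's received reactions contributed by each member."""
    total = sum(reactions_from.values())
    if total == 0:
        return {b: 0.0 for b in reactions_from}
    return {b: n / total for b, n in reactions_from.items()}


def weighted_first_degree_profile(delayed_profiles: Dict[str, List[float]],
                                  weights: Dict[str, float]) -> List[float]:
    """Eq. 12: q_k(a_0) = sum_j w(b_j, a_0) * r_{d,k}(b_j)."""
    n_buckets = len(next(iter(delayed_profiles.values())))
    return [
        sum(weights.get(b, 0.0) * profile[k]
            for b, profile in delayed_profiles.items())
        for k in range(n_buckets)
    ]


# Hypothetical history: b0 reacted to a_0 six times, b1 twice, b2 never.
weights = audience_weights({"b0": 6, "b1": 2, "b2": 0})
delayed = {
    "b0": [0, 4, 0],
    "b1": [8, 0, 0],
    "b2": [0, 0, 100],  # very active overall, but never reacts to a_0
}
q_w = weighted_first_degree_profile(delayed, weights)
```

Here the unweighted sum would be dominated by `b2`, who is highly active but has never reacted to the user, while the weighted profile shifts the recommendation towards the bucket in which `b0` is active.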


6. SCHEDULE EVALUATION 

In this section, we evaluate the user post schedules derived
above, namely $S_1(u)$, $S_2(u)$ and their respective weighted
counterparts. We evaluate them on empirical observations of real
user behavior over a 56-day period for 0.5 million users and
more than 25 million messages.

6.1 Baseline Schedules 

Because there are no previous baselines on the when-to-post
problem, we design two schedules to compare our approaches
with. We consider all users in a given timezone
and aggregate their behavior to create these baseline schedules.
Both the baselines are thus uniquely determined for
each timezone and are not personalized per user.

One natural baseline can be created by observing the most
frequently used time buckets for posting, aggregated across
all users in each timezone $T$. We thus obtain our first baseline,
the Most Frequently Used (MFU) Schedule, denoted as
$BS^{mfu}(T)$, with bucket values $bs_i^{mfu}(T)$ computed as:

$$bs_i^{mfu}(T) = \frac{\sum_{u \in U_T} c_i(u)}{\sum_{j=1}^{N} \sum_{u \in U_T} c_j(u)} \tag{14}$$

where $U_T$ is the set of users in the timezone $T$.

As explained in Sec. 5, the First-Degree Reaction Schedule
for a user is based on his first-degree audience behavior.
To generate another baseline for global behavior, we simply
aggregate the first-degree reaction schedules from all users in
the timezone. We call this second baseline schedule the
Aggregated First-Degree (AFD) Schedule, denoted as $BS^{afd}(T)$,
whose bucket values are given by:

$$bs_i^{afd}(T) = \frac{\sum_{u \in U_T} q_i(u)}{\sum_{j=1}^{N} \sum_{u \in U_T} q_j(u)} \tag{15}$$

where $U_T$ is the set of users in the timezone $T$ who have a
first-degree reaction schedule $Q(u)$.

Once we have the baseline schedules, we pick the buckets
with the highest values of $bs_i(T)$ as the best recommended
times to post for users in timezone $T$.
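A minimal Python sketch of the MFU baseline (Eq. 14) and of the bucket selection step is given below. The function names and toy profiles are hypothetical. The same aggregation would yield the AFD baseline of Eq. 15 if the per-user creation profiles were replaced by first-degree audience reaction profiles.

```python
from typing import List


def aggregate_baseline(profiles: List[List[float]]) -> List[float]:
    """Eq. 14: per-bucket share of all counts, summed over users in a timezone."""
    n_buckets = len(profiles[0])
    per_bucket = [sum(p[i] for p in profiles) for i in range(n_buckets)]
    total = sum(per_bucket)
    if total == 0:
        return [0.0] * n_buckets
    return [x / total for x in per_bucket]


def top_buckets(schedule: List[float], n: int) -> List[int]:
    """Indices of the n highest-valued buckets, best first.

    Ties keep their original bucket order because Python's sort is stable.
    """
    ranked = sorted(range(len(schedule)), key=lambda i: schedule[i], reverse=True)
    return ranked[:n]


# Hypothetical post creation profiles of three users in one timezone.
creation_profiles = [
    [1, 0, 3, 0],
    [0, 2, 1, 0],
    [1, 0, 2, 0],
]
bs_mfu = aggregate_baseline(creation_profiles)
recommended = top_buckets(bs_mfu, 2)
```

Because the baseline is computed once per timezone, every user in that timezone receives the same recommended buckets.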

6.2 Evaluation Metrics

For the purposes of evaluation of schedules, we propose a 
ReactionGain metric, which we compute as below. 

Let $U$ be the user sample set under consideration, observed
over $M$ days. Let us first consider a single user $u$ in
this sample. For this user $u$, we can rank the posting time
buckets as recommended by a schedule $S(u)$ over a period
of 24 hours, with the first bucket being the best time to post
and the last one being the worst.

For the $k$-th ranked bucket as per $S(u)$ we compute the
average reactions per message, $RPM(u, k)$:

$$RPM(u, k) = \frac{\sum_{j=1}^{M} r_{k,j}(u)}{\sum_{j=1}^{M} c_{k,j}(u)} \tag{16}$$

where $r_{k,j}(u)$ and $c_{k,j}(u)$ are respectively the reactions
received and the posts created by the user in the time bucket
corresponding to the $k$-th rank, on the $j$-th day. As before,
we compute $r_{k,j}(u)$ as the reactions received in the first 24
hours after the posting time.

We similarly define $RPM(u)$ as the ratio of all the reactions
received to all the posts created by the user in the
same 56-day period, across all the time buckets. We now
compute the ReactionGain, $RG(u, k)$, for the $k$-th bucket
for the user as:

$$RG(u, k) = \frac{RPM(u, k)}{RPM(u)} \tag{17}$$

This ratio tells us the increase or decrease in reactions received
by the user when she posts in a specific bucket, compared
to the average reactions per message she receives.

Finally, we compute the global average reaction gain for
each bucket $RG_{avg}(k)$ as the average of $RG(u, k)$ values over
all the users in the sampled population $U$ who created posts
in that bucket. We use this average reaction gain metric to
evaluate the schedules below.
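The metric defined in Eqs. 16 and 17 can be sketched in Python as follows. The data layout (`counts[bucket][day]`), the function names and the sample numbers are hypothetical, and the sketch assumes that reactions have already been attributed to the bucket of their post within the 24-hour window. Skipping users with no posts or no reactions is a choice made here to avoid division by zero, not something the text specifies.

```python
from typing import Dict, List

Counts = List[List[int]]  # counts[bucket][day]


def reaction_gain(reactions: Counts, posts: Counts,
                  schedule: List[float]) -> Dict[int, float]:
    """Eqs. 16-17: RG(u, k) for every rank k whose bucket contains posts."""
    total_posts = sum(sum(days) for days in posts)
    total_reactions = sum(sum(days) for days in reactions)
    if total_posts == 0 or total_reactions == 0:
        return {}
    overall_rpm = total_reactions / total_posts
    # Rank 1 is the bucket with the highest schedule value.
    ranked = sorted(range(len(schedule)), key=lambda b: schedule[b], reverse=True)
    gains: Dict[int, float] = {}
    for rank, bucket in enumerate(ranked, start=1):
        bucket_posts = sum(posts[bucket])
        if bucket_posts == 0:
            continue  # the user never posted in this bucket
        bucket_rpm = sum(reactions[bucket]) / bucket_posts
        gains[rank] = bucket_rpm / overall_rpm
    return gains


def average_reaction_gain(per_user: List[Dict[int, float]]) -> Dict[int, float]:
    """RG_avg(k): mean of RG(u, k) over the users who posted at rank k."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for gains in per_user:
        for rank, gain in gains.items():
            sums[rank] = sums.get(rank, 0.0) + gain
            counts[rank] = counts.get(rank, 0) + 1
    return {rank: sums[rank] / counts[rank] for rank in sorted(sums)}


# Two hypothetical users, three buckets, two days of observations.
gains_u1 = reaction_gain([[1, 1], [3, 3], [0, 0]],
                         [[2, 2], [2, 2], [0, 0]],
                         [0.1, 0.6, 0.3])
gains_u2 = reaction_gain([[4, 0], [0, 0], [0, 0]],
                         [[1, 1], [1, 1], [0, 0]],
                         [0.5, 0.2, 0.3])
rg_avg = average_reaction_gain([gains_u1, gains_u2])
```

A value above 1.0 at rank 1 indicates that posts made in the top recommended bucket received more reactions per message than the user's overall average.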


6.3 Real-world Evaluation 

We evaluate real user behavior and measure schedule performance
based on how many reactions were received when
the recommended times were used.

In our experiments, we sampled 0.25 million active users
each from Twitter and Facebook from the dataset described
in Sec. 3.3. For each sampled user $u$, we compute $S_1(u)$,
$S_2(u)$ and their corresponding weighted schedules as described
in Sec. 5, for a 63-day time period. We empirically
chose the $\alpha$ and $\beta$ parameters to be both 1.0, and $c_k(a_i)$
is rescaled to $c'_k(a_i)$ with the mean. We then evaluate the
recommended times on 25 million messages generated by the
sampled users in a 56-day time period, with no overlap with
the time period used to derive schedules.

To compare the performance of the top posting times recommended
by the schedules, we compute the average reaction
gain $RG_{avg}(k)$ for the bucket rank $k$, for each schedule.
Fig. 7 plots these values for the top 32 buckets for a weekday^8,
for both Facebook and Twitter.

We observe from Fig. 7 that the First-Degree Weighted
Schedule outperforms all the others on both Facebook and
Twitter. On Facebook, this schedule shows a reaction gain
of more than 17% in the highest bucket, and on Twitter the
highest gain is 4%. The second best schedule on Facebook
is the First-Degree Schedule, while that on Twitter is the
Second-Degree Weighted Schedule. Both the MFU and the
AFD baseline schedules show a reaction gain that is slightly
above 1.0 on Facebook, and mostly below 1.0 on Twitter,
showing that users who post according to these schedules
see little to no increase in reactions received.

Figure 7: Average Reaction Gain for Ranked Buckets

Both the second-degree schedules on Facebook show only 
a small reaction gain, very similar to the baseline schedules. 
The superior performance of the first-degree schedules on 
Facebook suggests that second-degree effects on this network 
are less dominant. This may stem from the inherent nature 
of the interactions on Facebook, and the manner in which 
users are shown posts that they could react upon. 

On Twitter, we observe that the weighted schedules for the
first degree as well as second degree perform better than the
baselines and the non-weighted ones. Thus the mutual relationships
between a user and his audience members play an
important role on Twitter in determining the expected reactions.
This observation highlights the importance of treating
each edge in a user graph differently.

Note that a good recommended schedule should show a
decreasing trend in reaction gains from the higher to the
lower ranked buckets, such that posting at the higher recommended
times leads to higher reaction gains. The baseline
schedules fall short in this regard, and show a decreasing
trend only in the first 10 buckets on Twitter, and none at all
on Facebook. The global baseline schedules thus prove to be
less effective in magnitude of reaction gains, as well as ordering
of buckets, validating our hypothesis that personalized
recommendations show better performance.

As an example of recommended schedules, Fig. 8 shows
the reaction profiles and schedules for a sample user on Twitter.
The purple curve in Fig. 8 shows the probability distribution
of post-to-reaction delay on Twitter, which is plotted
by aggregating reactions observed in a 63-day period.
Note that this function falls off steeply in the first few hours
from posting time, and almost vanishes after 12 hours. The
dashed curve plots the aggregated audience reactions for the
user, without the post-to-reaction delay. The red and the
blue curves show the First-Degree Weighted Schedule and
the Second-Degree Weighted Schedule respectively. The recommended
best times to post over one day and one week
are the peaks in the plot.

8 We exclude weekends here since they show diverging behavior
compared to weekdays, as shown in Sec. 4, but a
similar analysis can also be performed for weekends.

Figure 8: Example Schedules and Filter Function

7. CONCLUSION AND FUTURE WORK

In this study, we introduce and formulate a when-to-post
problem to find the best times to post on social networks in
order to increase the number of received reactions.

We analyze various factors that affect audience reactions
on a dataset containing over a billion reactions on hundreds
of millions of messages. We find that a majority of reactions
occur within the first 2 hours of posting times on most
networks. Audience behavior differs significantly on different
networks, with Twitter having larger reaction volumes
in shorter time windows as compared to Facebook. We also
perform location analysis and find interesting similarities
and differences between cities in terms of reaction patterns.
Future studies could also examine other factors such as content
and topical relevance of posted messages.

Further, we present multiple approaches for deriving personalized
posting schedules for users, and compare them to
two baselines. We evaluate these schedules on empirical data
from 0.5 million real-world users and 25 million messages observed
over a 56-day period. We find that the First-Degree
Weighted Schedule performs the best among all, providing
a reaction gain of 17% on Facebook and 4% on Twitter.
Both first-degree schedules perform better on Facebook and
both weighted schedules perform better on Twitter. These
schedules are deployed on a full production system that recommends
posting times to millions of users daily.

We hope that this study and the accompanying dataset
enable further research in this area.